Frequent Pattern Queries: Language and Optimizations

Authors

  • Francesco Bonchi
  • Dino Pedreschi
  • Chiara Renso
  • Mirco Nanni
Abstract

The objective of this thesis is to study data mining query optimization in a Logic-based Knowledge Discovery Support Environment, i.e., a flexible knowledge discovery system able to obtain, maintain, represent, and utilize both induced and deduced knowledge. In particular, we focus on frequent pattern queries, since this kind of query underlies many mining tasks and is a natural candidate for encapsulation in a DBMS as a primitive operation. We introduce an inductive language for frequent pattern queries that is simple enough to be highly optimized and expressive enough to cover most queries of interest. We then study optimization issues, defining novel algorithms for constrained frequent pattern mining, i.e., finding all itemsets in a transaction database that satisfy a given set of constraints. Finally, the contributions of the linguistic and algorithmic parts of the thesis are combined to define an optimized constraint-pushing operational semantics for the proposed language.
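
To make the notion of constrained frequent pattern mining concrete, here is a minimal Python sketch of what "pushing" a constraint into the search can look like. It is not one of the algorithms defined in the thesis: it is a generic level-wise (Apriori-style) search, and the function name `frequent_itemsets`, the `prices` table, and the "total price at most a bound" constraint are all hypothetical choices made for illustration. The constraint is anti-monotone only under the assumption that prices are non-negative, which is what allows it to prune candidates in the same way as the minimum-support threshold.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support, prices, max_total_price):
    """Return {itemset: support} for itemsets that are frequent and whose
    total price is at most max_total_price.

    Assumption: prices are non-negative, so the price constraint is
    anti-monotone (if an itemset violates it, so does every superset).
    """
    transactions = [frozenset(t) for t in transactions]

    def satisfies_constraint(itemset):
        return sum(prices.get(item, 0) for item in itemset) <= max_total_price

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    items = sorted({item for t in transactions for item in t})
    level = [frozenset([item]) for item in items]  # candidates of size 1
    result = {}
    size = 1
    while level:
        survivors = []
        for candidate in level:
            # Constraint pushing: discard before counting support.
            if not satisfies_constraint(candidate):
                continue
            count = support(candidate)
            if count >= min_support:
                result[candidate] = count
                survivors.append(candidate)
        # Build candidates of the next size from surviving itemsets only.
        size += 1
        next_level = set()
        for a, b in combinations(survivors, 2):
            union = a | b
            if len(union) != size:
                continue
            # Every (size - 1)-subset must itself have survived.
            if all(frozenset(s) in result for s in combinations(union, size - 1)):
                next_level.add(union)
        level = list(next_level)
    return result


# Hypothetical toy data, for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "beer"},
    {"bread", "milk", "beer"},
    {"milk"},
]
prices = {"bread": 2, "milk": 1, "beer": 5}
found = frequent_itemsets(transactions, min_support=2,
                          prices=prices, max_total_price=6)
```

In this toy run, `{bread, beer}` occurs in two transactions but is discarded without a support count because its total price exceeds the bound, while `{milk, beer}` passes the price check but is dropped for insufficient support.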

Similar resources

Improving the View Selection Algorithm in Data Warehouses by Finding Frequent Queries

A data warehouse is a repository for storing historical data to support decision making. Analytic queries usually take a long time. To address the response-time problem, some views should be materialized so that all queries can be answered in minimum response time. There are many solutions to the view selection problem. The most appropriate solution for view selection is materializing frequent queries. Previously posed ...

Type-based Semantic Optimization for Scalable RDF Graph Pattern Matching

Scalable query processing relies on early and aggressive determination and pruning of query-irrelevant data. Besides the traditional space-pruning techniques such as indexing, type-based optimizations that exploit integrity constraints defined on the types can be used to rewrite queries into more efficient ones. However, such optimizations are only applicable in strongly-typed data and query mo...

A Method for Protecting Access Pattern in Outsourced Data

Protecting the information access pattern, which means preventing the disclosure of data and structural details of databases, is very important in working with data, especially in the cases of outsourced databases and databases with Internet access. The protection of the information access pattern indicates that mere data confidentiality is not sufficient and the privacy of queries and accesses...

Frequent pattern mining under generalized subsumption

Frequent pattern mining (including the discovery of association rules) is an important task in data mining. Recently, there has been increasing interest in mining relational databases. Up to now, most algorithms have focussed on a syntactical approach. However, the use of background knowledge would greatly improve the quality of the results. First, patterns and rules which are not equivalent from a ...

Optimizing a Sequence of Frequent Pattern Queries

Discovery of frequent patterns is a very important data mining problem with numerous applications. Frequent pattern mining is often regarded as advanced querying where a user specifies the source dataset and pattern constraints using a given constraint model. A significant amount of research on efficient processing of frequent pattern queries has been done in recent years, focusing mainly on co...

Publication date: 2003